1.   INTRODUCTION

In this paper we shall describe the functional characteristics of a
program for scanning digitized* bubble-chamber negatives which is in an
advanced stage of realization in this Laboratory using an IBM 7090 computer.
Figure 1 illustrates how the processing carried out by this program fits into
the over-all scheme for the automatic analysis of bubble-chamber data.

Figure 1.   Analysis Scheme for Bubble-Chamber Data

Of  the  six  stages  shown  above,  this  paper  is  concerned  primarily  with 
the  scanning  phase.  We  shall  explain  in  the  next  section  the  input-output 
features  of  this  phase  and,  briefly,  the  processing  technique  employed  to 
generate  this  information  from  the  digitized  input  picture. 

* The digitization is not coordinate digitization but the conversion of the
negative into a raster of black-white dots which are stored in the computer
as 1's and 0's respectively. See, for example, the digitized images shown
in Fig. 4 below.

In the preprocessing stage a certain amount of noise cleaning is
performed on the output of the digitizer to match this image to the input
requirements of the scanning program. This noise cleaning, which consists of
gap  filling  and  thinning  of  the  tracks  of  the  digitized  image,  is  done  at  a 
purely  local  level.   In  other  words,  one  would  not  attempt  at  this  stage  to 
connect  up  "large"  gaps  in  the  tracks  or  to  clean  up  extended  black  areas  in 
the  scanning  field.   These  will  be  dealt  with,  along  with  other  ambiguities 
encountered  while  scanning,  in  the  post-editing  stage.   Some  general-purpose 
preprocessing  routines  have  been  developed  and  a  few  processed  images  that  have 
been  obtained  using  these  routines  are  illustrated  in  Sec.  3  below. 

The noise cleaning at the local level referred to above and portions
of the scanning program discussed in the next section will ultimately be realized
using the parallel processing unit, called the Pattern Articulation Unit (PAU),
currently under fabrication in this Laboratory. An IBM 7090 simulator, termed
PAX, for a general-purpose parallel processing computer has been written [5].
In the scanning program described in the sequel, whenever the processing intended
is in the PAU mode, it is currently being realized using PAX.


2.   THE SCANNING PROGRAM

2.1  Preliminaries 

The program is set up to scan bubble-chamber negatives one view at a
time. Comparative analyses between the separate views will be explicitly
relegated to a higher level post-editing program. The input to this latter
program will be the outputs from the independent scanning of the three component
views separately. Keeping this in mind, we seek to generate in each one of
these outputs adequate peripheral information so as to make the post-editing
simple and efficient. For the purposes of our discussion in this section we
shall assume that the input picture to the scanning program is digitized and
suitably preprocessed.

2.2  The  Output  Features 

Starting with such an input picture, the scanning program is set up
to generate two output lists: (1) a TRACK LIST and (2) a VERTEX LIST. (Here
and in the rest of this paper, track and vertex are used as generic names. Only
those interaction points involving two or more tracks are listed as vertices.
All the rest are grouped under terminals.)

The VERTEX LIST consists of a set of vertex names, each vertex name
pointing to a list of track names associated with that vertex. (In the next
subsection we consider the data structure in greater detail and explain the
actual format of these list structures as compiled in the IBM 7090.) All other
tracks in the picture not associated with any vertex are listed by name in the
TRACK LIST. Thus the two lists are mutually disjoint.

Each track name and vertex name carries with it other subsidiary
information for further classification of these (at the post-editing level) into
V-type vertices, stars, beam tracks, spirals and so on. In the program to be
described, this peripheral information is computed according to the following
scheme:

Tracks. Each track name is associated with a list of points P_i
(identified by their coordinates x_i, y_i) which together define the track
spatially in the picture. (We shall describe later how these points P_i are
actually chosen.) Each track name contains three kinds of local descriptions:

1.  end-point  information:   interior  terminals,  wall  terminals 
(if  wall,  which  wall),  vertex  terminals,  etc. 

2.  curvature information: this is a qualitative classification
into one of four groups 0, 1, 2, 3 in the ascending order
of curvature, together with the specification whether the
sense is positive or negative, based on some fixed
convention.

3.  length  information:   this,  for  the  present,  is  restricted 
to  a  measure  of  the  total  number  of  points  listed  in  the 
track.   Since  our  point  listing  is  done  on  a  systematic 
basis,  a  qualitative  idea  of  the  length  can  be  easily 
inferred  from  this  measure. 

Each track name is further classified globally into one of four types
as follows:

TYPE  I:    All  nonvertex  tracks  (i.e.,  not  associated  with  any 
vertex)  in  the  curvature  groups  0,  1  and  with  at 
least  one  end  point  on  a  wall.   Noninteracting  beam 
tracks  should  belong  to  this  type. 


TYPE  II:   All  nonvertex  tracks  in  the  curvature  groups  0,  1 
but  with  both  end  points  interior  to  the  chamber. 
Fragments  of  beam  tracks  and  possibly  of  vertex 
prongs  should  belong  to  this  type. 

TYPE  III:   All  tracks  with  at  least  one  end  point  on  a  vertex. 

TYPE IV:   All nonvertex tracks in the curvature groups 2, 3.
Spirals, helices, etc. should belong to this type.
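
The four-way classification above can be sketched in a few lines. This is an
illustration only, not the program's actual coding; the function name and its
arguments are hypothetical, and it assumes that the curvature group and the
end-point conditions of the track have already been compiled.

```python
def classify_track(curvature_group, on_vertex, on_wall):
    """Return the global track type 'I', 'II', 'III' or 'IV'.

    curvature_group: qualitative curvature group, 0 to 3
    on_vertex: True if at least one end point lies on a vertex
    on_wall: True if at least one end point lies on a wall
    """
    if curvature_group not in (0, 1, 2, 3):
        raise ValueError("curvature group must be 0, 1, 2 or 3")
    if on_vertex:
        return "III"            # any track ending on a vertex
    if curvature_group >= 2:
        return "IV"             # nonvertex, strongly curved (spirals etc.)
    # nonvertex tracks in curvature groups 0 and 1
    return "I" if on_wall else "II"
```

Since TYPE III is defined without reference to curvature, the vertex test is
applied first; the remaining three types then cover all nonvertex tracks.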

Vertices. Vertex names carry two types of descriptive information:
(1) (x,y) coordinates of their position in the picture and (2) associated labels
which give information about the local orientation of tracks ending on the
vertices. From the number of associated labels one can readily compute the
number of tracks ending on the vertices.

An ideal scanning program should identify only those points as vertices
which represent actual events in the chamber. It is clear, however, that any
operating program can only hope to approximate this ideal, particularly if
scanning is restricted to one view at a time. Our compilation techniques are
organized so that ambiguities between cross-overs and vertices can be resolved
to a reasonable extent by reference to the currently available global
information about the track segments involved. Our policy, however, is to fail
safe by creating a new vertex whenever in doubt. In addition we shall also carry
along information about the cross-overs and the bends in the tracks. The
post-editing program, acting on all this auxiliary information and on the
compiled information for tracks and vertices described earlier from the three
views of a stereotriad, checks each listed vertex to see whether it is in fact
a vertex and, if so, classifies it according to its event type.

2.3  Data  Structure 

All the data storage is in the form of lists and sublists with well-
defined hierarchic structures. These lists are made up of four distinct cell
types each with its own internal structure. These are the A, B, C and E cell
types. We shall, in this section, describe these briefly and show how the track
and vertex lists are built out of these and stored within the IBM 7090 during the
execution of the scanning program.

In the IBM 7090, each cell is represented by a machine word. The 36
bits of each cell are divided into three fields of 12 bits each. We shall refer
to these as fields 1, 2 and 3. Field 3 is invariably used to link two
consecutive cells at the same level in a list structure. The other two fields
serve different functions in the several cell types as shown later in this
section.
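
The division of a 36-bit word into three 12-bit fields can be illustrated as
follows. This is a sketch under an assumption the text does not state, namely
that field 1 occupies the most significant 12 bits and field 3 (the link) the
least significant; the function names are hypothetical.

```python
FIELD_BITS = 12
FIELD_MASK = (1 << FIELD_BITS) - 1   # 4095, the largest 12-bit value


def pack_cell(field1, field2, link):
    """Pack three 12-bit fields into one 36-bit word (field 3 is the link)."""
    for value in (field1, field2, link):
        if not 0 <= value <= FIELD_MASK:
            raise ValueError("each field must fit in 12 bits")
    return (field1 << 2 * FIELD_BITS) | (field2 << FIELD_BITS) | link


def unpack_cell(word):
    """Split a 36-bit word back into (field1, field2, link)."""
    return (
        (word >> 2 * FIELD_BITS) & FIELD_MASK,
        (word >> FIELD_BITS) & FIELD_MASK,
        word & FIELD_MASK,
    )
```

A 12-bit link field can address at most 4096 distinct cells, which bounds the
size of any list storage reached through such links.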

All  our  primitive  lists  are  linear  strings  and  in  all  our  data 
storage  we  do  not  use  more  than  a  three-level  hierarchic  structure.   At  each 
level,  a  linear  string  is  named  by  a  TRAILER.   The  trailer  can  be  thought  of  as 
the  head  of  a  list  since  all  links  point  away  from  the  trailer.   The  cell 
immediately  following  a  trailer  in  a  list  will  be  referred  to  as  the  NEAREND 
and  the  cell  at  the  other  end,  the  FAREND.   We  shall  assume  that  some  distinct 
end  symbol  is  used  to  terminate  a  linear  string. 

The function and structure of each cell type can now be described. A
cells are used exclusively as trailers of track lists, C cells exclusively as
trailers of vertex lists. B cells form the components (i.e., the body) of a
track list. Finally, E cells are used exclusively by the scanning program for
temporary storage of track names during the course of the processing. In the
track and vertex trailers (i.e., the A and C cells) are stored all the compiled
information about the tracks and associated vertices. Because of the restricted
length of the machine word in the IBM 7090, it becomes necessary to allocate
two machine words for each of these trailers. The format of the A, B and C
cells is shown below:


[Cell format diagram not recoverable. Surviving field labels: TRACK DETAILS;
(B-LINK); X-Y COORDINATE.]

The significance of the track details and the labels (which constitute the
descriptive compiled information for the tracks and vertices, respectively) has
already been explained in Sec. 2.2 above. The term "window" will become clear
when we describe the actual scanning procedure in the next section. In the
B cell, along with the X-Y coordinates, we also store information concerning
bends and cross-overs in the track.

It  is  now  easy  to  visualize  the  stored  data  structure  at  any  given 
stage  in  the  processing.   The  first  configuration  below  illustrates  two  tracks 
linked  in  the  track  list.   The  second  illustrates  two  vertices  linked  together 
in  the  vertex  list.   The  circles  indicate  list  terminations. 


[List diagrams not recoverable in detail. TRACK LIST: A trailers, each
followed by a linked string of B cells. VERTEX LIST: C trailers, each
followed by A trailers with their strings of B cells.]


It is important to note that the information is stored without duplication. At
any given time a particular named track will be found stored at only one place.
If it has been completely processed, it will be found in the TRACK LIST;
otherwise it will be found in some E list. If, however, a track is associated with
a vertex, it will only be listed in the C list corresponding to that vertex.
Thus the scanning proceeds transferring names of lists from one place to another,
till, ultimately, all single tracks end up in the TRACK LIST, all vertex tracks
in some C list and the vertices themselves in the VERTEX LIST.
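
The rule that each named track is stored in exactly one place can be
illustrated with ordinary Python containers in place of linked machine cells.
This is only a sketch of the bookkeeping invariant; the class and method names
are hypothetical and do not correspond to routines of the scanning program.

```python
class TrackStore:
    """Keeps every track name in exactly one of: an E list (temporary),
    the TRACK LIST, or the C list of a single vertex."""

    def __init__(self):
        self.e_list = []        # tracks still being processed
        self.track_list = []    # completed tracks with no vertex
        self.vertex_list = {}   # vertex name -> track names in its C list

    def locate(self, name):
        """Return 'E', 'TRACK', 'C:<vertex>' or None if the name is unknown."""
        if name in self.e_list:
            return "E"
        if name in self.track_list:
            return "TRACK"
        for vertex, tracks in self.vertex_list.items():
            if name in tracks:
                return "C:" + vertex
        return None

    def open_track(self, name):
        """Enter a newly named track into temporary storage."""
        if self.locate(name) is not None:
            raise ValueError("track is already stored: " + name)
        self.e_list.append(name)

    def file_track(self, name, vertex=None):
        """Transfer a track out of the E list to its final place."""
        self.e_list.remove(name)   # raises ValueError if it is not in E
        if vertex is None:
            self.track_list.append(name)
        else:
            self.vertex_list.setdefault(vertex, []).append(name)
```

Because a name is always removed from the E list before it is filed, no
transfer can leave two copies of the same track behind.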

2.4  The Scanning Procedure

Having discussed the input-output features of the scanning program
and its data structure, we are finally in a position to describe its actual
operational details. Because of the complexity of its details, only a schematic
outline will be presented here. More complete accounts of the various units
which comprise this program will be found in the references listed at the end
of this paper.

In the scanning procedure employed, the digitized input picture is,
to begin with, partitioned into a network of windows of a fixed size as shown
in Fig. 2. Each window represents a square raster of size 32 x 32 bits. This
window size is determined by the design features of the PAU which will
ultimately be used to realize parts of the algorithms used in the scanning
process. The total number of windows will of course be determined by the
fineness of digitization used and this again depends upon the ultimate resolution
required. On the basis of some preliminary estimates made using 72-inch hydrogen
bubble-chamber exposures at 15:1 demagnification, it is our view that a
partitioning of a picture into roughly 15 windows across and 60 windows along its
length provides adequate resolution to collect the sort of information discussed
here.


[Figure 2 labels: PAST; FUTURE; current window, 32 x 32 bits; past-future
boundary.]

Figure 2.   Partitioning into Windows (Schematic)

Our technique is to process windows sequentially starting with the
lower-left corner and proceeding row by row (from left to right) till the
top-right corner is reached. Within each window we go through an iterative
procedure (1) to update any of the tracks so far named and compiled which enter
this window and (2) to identify and name new tracks which originate from this
window. Similar considerations hold for the vertices. To update is to list
new points which lie on the tracks inside the window currently being processed
and on the basis of this fresh information to reclassify their length, curvature
and type information.
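
The order in which windows are visited can be sketched as a simple generator.
This is an illustration, not the program's sequencing code; the names are
hypothetical, and the defaults of 15 windows across and 60 along are the rough
figures quoted above, which at 32 bits per window side would correspond to a
raster of about 480 x 1920 bits.

```python
WINDOW_SIZE = 32   # bits per window side, as fixed by the PAU design


def window_order(n_across=15, n_along=60):
    """Yield (row, col) window indices starting at the lower-left window
    and proceeding row by row, left to right, to the top-right window."""
    for row in range(n_along):        # row 0 is the bottom row
        for col in range(n_across):   # col 0 is the left-most window
            yield row, col


def window_origin(row, col):
    """Raster coordinates (x, y) of the lower-left bit of a window."""
    return col * WINDOW_SIZE, row * WINDOW_SIZE
```

With this ordering, every window below the current row and every window to
the left in the current row has already been processed when a window is
reached.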

If we do this updating systematically for each window, we would have
compiled, when through with the last window, lists which reflect the output
information we set out to gather. In this method of processing, abstraction of
information from any specific window can be made to depend upon the list-
compilation history up to that instant and so search mechanisms can be optimized.
This optimization capability is of very great importance since in general, because
of the window size, one should expect the contribution from the majority of
windows to be nil (cf., for example, some preliminary statistics listed in [3]).

The scanning procedure thus divides naturally into two alternating
phases: (1) the in-window phase which is primarily syntactic in character and
(2) the cross-window phase which is primarily administrative in character.
Reflecting this twofold nature of the processing scheme, the program structure
itself is divided into two more or less autonomous parts: an administrative
part consisting of a single subprogram called MAIN and a syntactic part consisting
of two subprograms called LABEL and SEARCH.

Of these three, SEARCH carries out the actual in-window processing.
Given a partial track incident on the window, it seeks to answer the following
types of questions: Does the track extend into the window? If so, does it
cross the window or die inside? In either case, what are the end point
conditions? If two tracks meet, do they meet on a vertex? And so on. It seeks
to answer these questions both on the basis of the currently available compiled
information transferred to it by MAIN and on the basis of auxiliary information
generated by the subprogram called LABEL.

LABEL,  which  operates  entirely  in  the  PAU  mode,  comprises  a  variety  of 
algorithms  to  convert  the  picture  inside  the  window  into  a  labeled  graph 
identifying  the  local  directions  associated  with  the  branches  (i.e.,  the  track 
segments)  and  classifying  the  nodes  into  vertices,  bends,  cross-overs  and 
terminals.   It  also  identifies  which  nodes  are  associated  with  which  walls  and 
which  ones  are  interior  to  the  window. 

The MAIN program is the principal executive routine concerned with
all the administrative details. It performs all bookkeeping required for
compiling and updating the syntactic information supplied to it by SEARCH;
provides transfer information for cross-window references; and takes care of the
general sequencing of the operations which make up the total system.

SEARCH and MAIN communicate through a fixed set of information cells
and a control cell called the STATE SWITCH (SS). The sequencing at any given
stage in compilation proceeds as follows: the MAIN program loads the appropriate
input cells with the relevant information and sets the input state of SS. SEARCH
functions very much like a finite state machine. Depending upon the input
configuration (specified by MAIN) and the window configuration (prescribed by
LABEL), it sets SS to a specific output state and an output code. MAIN now
selects an entry, as determined by the total configuration of SS, from a
compilation table and carries out the compilation as directed by the entry.
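
The handshake between MAIN and SEARCH through the STATE SWITCH can be sketched
as a table-driven dispatch. Everything named below is hypothetical: the paper
does not list the actual states, output codes or table entries, so the ones
shown are invented purely to make the control flow concrete.

```python
def search(input_state, window_config):
    """Toy stand-in for SEARCH: from the input state (set by MAIN) and the
    window configuration (from LABEL), produce (output state, output code)."""
    rules = {
        ("TRACK_INCIDENT", "crosses_window"): ("TRACK_CONTINUES", "EXTEND"),
        ("TRACK_INCIDENT", "ends_inside"): ("TRACK_ENDS", "CLOSE"),
        ("TRACK_INCIDENT", "meets_track"): ("VERTEX_FOUND", "NEW_VERTEX"),
    }
    return rules.get((input_state, window_config), ("NO_CHANGE", "SKIP"))


def extend_track(log):
    log.append("list new points; track stays in temporary storage")


def close_track(log):
    log.append("close track; move it out of temporary storage")


def new_vertex(log):
    log.append("create vertex; attach the tracks that meet there")


def skip(log):
    log.append("nothing to compile")


# Compilation table keyed by the total configuration of the state switch.
COMPILATION_TABLE = {
    ("TRACK_INCIDENT", "TRACK_CONTINUES", "EXTEND"): extend_track,
    ("TRACK_INCIDENT", "TRACK_ENDS", "CLOSE"): close_track,
    ("TRACK_INCIDENT", "VERTEX_FOUND", "NEW_VERTEX"): new_vertex,
}


def main_step(input_state, window_config, log):
    """One MAIN/SEARCH exchange: call SEARCH, then carry out the table entry
    selected by (input state, output state, output code)."""
    out_state, out_code = search(input_state, window_config)
    action = COMPILATION_TABLE.get((input_state, out_state, out_code), skip)
    action(log)
    return out_state, out_code
```

Keeping the actions in a table rather than in branching code means that new
configurations can be handled by adding entries without touching the
sequencing logic.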

The schematic flow chart of the entire scanning program is thus as
given in Fig. 3 below.


[Flow chart labels, as far as recoverable: Start; MAIN (Part I);
MAIN (Part II); Stop if last; LABEL.]

Figure 3.   Schematic Flow Chart of the Scanning Program


The  following  time  estimates  for  processing  the  three  views  of  a  72-inch 
stereotriad  using  the  PAU  have  been  supplied  by  Dr.  B.  H.  McCormick. 

1.  Scanning rate (average) of digitizer: approximately 1 bit/μsec.

2.  Time for processing one window for three views using PAU: approximately
1 msec.

3.  Scan and process three views of a stereotriad: approximately
15 x 60 x 1 msec ~ 1 second.
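
The arithmetic behind the last estimate can be checked directly. The variable
names below are illustrative; the figures are those quoted above.

```python
WINDOWS_ACROSS = 15      # windows across the picture
WINDOWS_ALONG = 60       # windows along its length
MSEC_PER_WINDOW = 1.0    # PAU time per window, all three views together

total_windows = WINDOWS_ACROSS * WINDOWS_ALONG      # 900 windows
total_seconds = total_windows * MSEC_PER_WINDOW / 1000.0
```

The product comes to 900 msec, which the estimate rounds to about one second
per stereotriad.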

2.5  Current Status of the Scanning Program

Of the three principal subprograms outlined above, the structure of
MAIN has been completely worked out and its coding specified at the macro-
level [1, 2]. Its machine language coding is being implemented in well-defined
stages. The coding for the first stage is now being carried out. A tentative
flow chart for the first stage of SEARCH has been worked out but its coding
remains to be done [4]. A complete program for LABEL has been coded and tested
out and we give here two labeled graphs generated by this program [4].

In Fig. 4 are shown the digitized versions of parts of two bubble-
chamber negatives. These were manually prepared from tracings of enlarged prints
from the negatives. The pictures as shown are made up of an array of 100 x 68
bits each. These input pictures were initially divided into six windows (each
of size 32 x 32 bits) as indicated in the figure. Each of these windows was
processed independently using LABEL and the outputs reassembled.

Figure 5 shows the labeled output and the abstracted graph for the
input picture on the left in Fig. 4, and Fig. 6, the corresponding outputs for
that on the right in Fig. 4. The four principal labels assigned to the branches
are N (for North-South), E (for East-West), A (for right diagonal) and B (for
left diagonal). The junctions, crossings, bends, etc., where two or more roads
meet are identified by their multiple labels. In the outputs these are given by
the following code:


E, A:          3
A, B:          0
E, N, B:       4
E, N:          5
N, B:          2
A, N, B:       8
E, B:          9
E, A, N:       7
E, A, N, B:    U
A, N:          6
E, A, B:       1

The  points  not  assigned  any  of  the  labels  are  indicated  by  asterisks  (*). 
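
The code above amounts to a lookup from a set of branch labels to a single
output character, which can be sketched as follows. The function name is
hypothetical, the code for the combination E, N, B is read here as 4, and the
treatment of singly labeled points (printed as their own label) is an
assumption made for the sake of a complete example.

```python
# Output character for each multiply labeled point, keyed by its label set.
JUNCTION_CODE = {
    frozenset("EA"): "3",
    frozenset("AB"): "0",
    frozenset("ENB"): "4",   # assumed reading of an unclear character
    frozenset("EN"): "5",
    frozenset("NB"): "2",
    frozenset("ANB"): "8",
    frozenset("EB"): "9",
    frozenset("EAN"): "7",
    frozenset("EANB"): "U",
    frozenset("AN"): "6",
    frozenset("EAB"): "1",
}


def output_symbol(labels):
    """Return the character printed for a point carrying the given labels."""
    labels = frozenset(labels)
    if not labels <= frozenset("NEAB"):
        raise ValueError("labels must be drawn from N, E, A, B")
    if not labels:
        return "*"                   # point assigned none of the labels
    if len(labels) == 1:
        return next(iter(labels))    # assumption: single label shown as itself
    return JUNCTION_CODE[labels]
```

The eleven entries are exactly the eleven ways of choosing two or more labels
out of four, so every multiply labeled point has a code.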

The graphs on the right-half of Figs. 5 and 6 were obtained by
outputting a single representative point for each one of the multiple-labeled sets
and for each terminal. The point-pairs along the tracks which appear on either
side of the window walls provide transfer links to cross window boundaries.

(Each window, as actually used, had an overlap of four bits with its top and
right neighbors and hence had a working field of 36 x 36 bits. In the
"abstracted" graphs of Figs. 5 and 6, however, only the 32 x 32 windows are
shown.)
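
The four-bit overlap noted above accounts for the picture size of 100 x 68
bits: three windows across and two up give 96 x 64 bits, plus four bits of
overlap on the right and top edges. A short sketch, with hypothetical names,
that lays out the six working fields:

```python
def working_fields(width=100, height=68, size=32, overlap=4):
    """Return (x0, y0, x1, y1) for each window's working field, in row-by-row
    order from the lower left. Each field extends `overlap` bits beyond the
    nominal window on its right and top sides; x1 and y1 are exclusive."""
    cols = (width - overlap) // size
    rows = (height - overlap) // size
    fields = []
    for row in range(rows):
        for col in range(cols):
            x0, y0 = col * size, row * size
            fields.append((x0, y0, x0 + size + overlap, y0 + size + overlap))
    return fields
```

For the default arguments this yields six fields of 36 x 36 bits, the last of
which ends exactly at the picture boundary.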


3.   PREPROCESSING 

As was remarked earlier in this paper, it is envisaged that a certain
amount of gap filling and thinning will have to be done on the digitizer output
in order to match it to the input requirements of the scanning program
(specifically, of the labeling algorithms). With this in view, several general-
purpose preprocessing routines have been developed. All of these operate at
the purely local level and take maximum advantage of the known configurational
features of bubble-chamber pictures, e.g., the predominant north-south
orientation of the beam tracks, the gaps that occur along such beam tracks and so on.
Figs. 7 and 8 illustrate the result of applying one such preprocessing
program to two bubble-chamber picture segments. The corresponding "noisy" inputs
are shown on the left in each case.
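
As an indication of what a purely local gap-filling operation might look
like, the sketch below closes short breaks along the north-south direction.
It is not one of the routines referred to above, whose details are not given
here; the function name and the gap threshold are assumptions chosen for
illustration.

```python
def fill_vertical_gaps(image, max_gap=2):
    """Return a copy of a 0/1 raster (list of rows) in which every vertical
    run of at most `max_gap` zeros bounded above and below by ones is filled.
    Longer gaps and runs touching the raster edge are left untouched."""
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for c in range(cols):
        r = 0
        while r < rows:
            if image[r][c] == 1:
                r += 1
                continue
            start = r                          # first zero of the run
            while r < rows and image[r][c] == 0:
                r += 1                         # r stops just past the run
            bounded = start > 0 and r < rows   # a one on both sides
            if bounded and (r - start) <= max_gap:
                for k in range(start, r):
                    out[k][c] = 1
    return out
```

Because only short, bounded runs are filled, "large" gaps are left for the
post-editing stage, in keeping with the local character of the preprocessing.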